Skip to content

Refactor BaseTableDataManager #18381

Open
krishan1390 wants to merge 15 commits into
apache:masterfrom
krishan1390:basetdm_refactor
Open

Refactor BaseTableDataManager #18381
krishan1390 wants to merge 15 commits into
apache:masterfrom
krishan1390:basetdm_refactor

Conversation

@krishan1390
Copy link
Copy Markdown
Contributor

@krishan1390 krishan1390 commented Apr 30, 2026

Summary

Small refactors to BaseTableDataManager (and its tests) to enable cleaner extensions, abstractions, and ownership for future work.

1. Push _segmentReloadSemaphore acquire/release into reloadSegment(SegmentDataManager, ILC, boolean)

Removes the duplicated outer acquire/release in the reloadSegment(String) and parallel reloadSegments(List<SDM>) paths — there is now a single acquisition site.

2. Centralize segment delete in BaseTableDataManager

Segment deletion belongs on the TDM, not on HelixInstanceDataManager. This change moves it there:

  • New SPI method: TableDataManager#deleteSegment(String).

3. Drop destroy()-without-offload() assertion in SegmentDataManager

destroy() already calls offload() first, so callers can simply do:

if (segmentDataManager.decreaseReferenceCount()) {
    segmentDataManager.destroy();
}

instead of the longer pattern that requires an explicit prior offload(). The previous assertion forced the longer pattern with no real benefit.

4. Split tryLoadExistingSegment so subclasses can load without registering

Introduces tryLoadExistingSegmentInternal which returns the ImmutableSegment (or null on miss/stale CRC/load failure) without invoking addSegment or other registration hooks. The public tryLoadExistingSegment is now a thin wrapper that calls the internal method and registers the result.

5. Test refactor for extensibility

BaseTableDataManagerAcquireSegmentTest is reworked into a reusable base so subclasses can run all existing test bodies against a different TableDataManager implementation. Exposed hooks:

  • newTableDataManager() — choose the concrete TDM under test.
  • tdmKey(segmentName) — map a logical segment name to the TDM-map key.
  • addSegment(tdm, seg), getInnerSegment(sdm) — small indirections so subclasses can vary segment registration / wrapping.

Adjacent visibility tweaks (isSegmentStaleprotected, hasSameCRCpublic) support the same goal.

SPI changes

  • New: TableDataManager#deleteSegment(String).

Test plan

  • Existing BaseTableDataManager and subclass tests pass.
  • Reload / segment-load / delete paths exercised in integration tests.

krishan1390 and others added 2 commits April 30, 2026 12:23
…ts building blocks

Three small refactors that keep single-segment behavior identical and let a
future multi-segment SegmentDataManager (e.g. a wrapper around N constituent
segments) reuse the same load and reload primitives without forking them:

1. Extract `protected ImmutableSegment loadSegment(zkMetadata, ilc)` from
   `downloadAndLoadSegment`. The new helper performs only the download +
   `ImmutableSegmentLoader.load`, returning the segment without registering it
   in `_segmentDataManagerMap` or invoking upsert hooks. Single-segment callers
   continue to use `downloadAndLoadSegment`, which now composes the helper +
   `addSegment(...)`. This lets a multi-segment manager load all of its members
   first and register a single wrapper entry under one name.

2. Push `_segmentReloadSemaphore` acquire/release down into
   `reloadSegment(SegmentDataManager, IndexLoadingConfig, boolean)`. The public
   `reloadSegment(String)` and the private parallel `reloadSegments(List<SDM>)`
   both used to wrap the inner call with the semaphore; that acquire is now
   inside the per-physical-segment body and the outer wrappers are removed
   (which would otherwise double-acquire on a non-reentrant semaphore). For
   non-group tables this is structurally identical (one segment -> one acquire
   -> one release; same concurrency bound). For multi-segment managers that
   fan out N reloads, each member contends for a slot independently.

3. Drop `@VisibleForTesting` on `isSegmentStale(IndexLoadingConfig,
   SegmentDataManager)` and widen to plain `protected` so subclasses can call
   it from group-aware overrides of `getStaleSegments` / `needReloadSegments`.

The semaphore stays at the orchestration boundary in `doReplaceSegment` (not
relocated into `replaceSegmentIfCrcMismatch`), because subclasses commonly
override `replaceSegmentIfCrcMismatch` and a relocation there would leak the
acquire across paths that bypass the override; subclasses needing per-member
acquire on a multi-segment replace can wrap the call themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 57.89474% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.67%. Comparing base (bd76ff0) to head (25af3f2).

Files with missing lines Patch % Lines
.../pinot/core/data/manager/BaseTableDataManager.java 63.46% 17 Missing and 2 partials ⚠️
...server/starter/helix/HelixInstanceDataManager.java 0.00% 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18381      +/-   ##
============================================
- Coverage     63.68%   63.67%   -0.01%     
  Complexity     1685     1685              
============================================
  Files          3266     3266              
  Lines        199821   199849      +28     
  Branches      31022    31024       +2     
============================================
+ Hits         127257   127258       +1     
- Misses        62425    62450      +25     
- Partials      10139    10141       +2     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 63.67% <57.89%> (-0.01%) ⬇️
temurin 63.67% <57.89%> (-0.01%) ⬇️
unittests 63.67% <57.89%> (-0.01%) ⬇️
unittests1 55.81% <21.15%> (-0.02%) ⬇️
unittests2 34.95% <43.85%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@krishan1390 krishan1390 marked this pull request as draft May 4, 2026 12:23
@krishan1390 krishan1390 changed the title Refactor BaseTableDataManager so multi-segment managers can compose its building blocks Refactor BaseTableDataManager to enable cleaner extensions, abstractions and ownership for future work May 6, 2026
@krishan1390 krishan1390 changed the title Refactor BaseTableDataManager to enable cleaner extensions, abstractions and ownership for future work Refactor BaseTableDataManager May 6, 2026
}

@Override
public void deleteSegment(String segmentName)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved from HelixInstanceDataManager to TableDataManager as this is the right place

} finally {
_segmentReloadSemaphore.release();
}
reloadSegment(segmentDataManager, indexLoadingConfig, forceDownload);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the sempahore to inside reloadSegment

*/
public void destroy() {
// NOTE: We want the test to catch the case when destroy is called without offloading, but not fail the production.
assert _offloaded.get() : "Cannot destroy segment data manager without offloading it first";
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense for destroy to call offload first if we want the segment to be offloaded only after its reference count becomes 0

Also I don't see a reason to complicate clients to do below loop always

segmentDataManager.offload();
if (segmentDataManager.decreaseReferenceCount()) {
      segmentDataManager.destroy();
}

Rather below is simpler

if (segmentDataManager.decreaseReferenceCount()) {
      segmentDataManager.destroy();
}

Thus removing this assertion

private final Set<SegmentDataManager> _accessedSegManagers = ConcurrentHashMap.newKeySet();
private final Set<SegmentDataManager> _allSegManagers = ConcurrentHashMap.newKeySet();
private Map<String, ImmutableSegmentDataManager> _internalSegMap;
protected Map<String, SegmentDataManager> _internalSegMap;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are refactoring to enable a more generic and extensible BaseTableDataManagerTest

} else {
_logger.info("Reloading (force committing) consuming segment: {}", segmentName);
((RealtimeSegmentDataManager) segmentDataManager).forceCommit();
_segmentReloadSemaphore.acquire(segmentName, _logger);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we were capturing the semaphore earlier too for force commit, we're doing it now too.

@krishan1390 krishan1390 marked this pull request as ready for review May 13, 2026 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants